Offline Pre-trained Multi-agent Decision Transformer

Abstract

Offline reinforcement learning leverages previously collected offline datasets to learn optimal policies without needing to access the real environment. Such a paradigm is also desirable for multi-agent reinforcement learning (MARL) tasks, given the combinatorially increased interactions among agents and with the environment. However, in MARL, the paradigm of offline pre-training with online fine-tuning has not been studied, nor are datasets or benchmarks for offline MARL research available. In this paper, we facilitate the research by providing large-scale datasets and using them to examine the usage of the decision transformer in the context of MARL. We investigate the generalization of MARL offline pre-training in the following three aspects: 1) between single agents and multiple agents, 2) from offline pre-training to online fine-tuning, and 3) to multiple downstream tasks with few-shot and zero-shot capabilities. We start by introducing the first offline MARL dataset with diverse quality levels based on the StarCraft II environment, and then propose the novel architecture of the multi-agent decision transformer (MADT) for effective offline learning. MADT leverages the transformer’s sequence modelling ability and integrates it seamlessly with both offline and online MARL tasks. A significant benefit of MADT is that it learns generalizable policies that can transfer between different types of agents under different task scenarios. On the StarCraft II offline dataset, MADT outperforms the state-of-the-art offline reinforcement learning (RL) baselines, including BCQ and CQL. When applied to online tasks, the pre-trained MADT significantly improves sample efficiency and enjoys strong performance in both few-shot and zero-shot cases. To the best of our knowledge, this is the first work that studies and demonstrates the effectiveness of offline pre-trained transformer models in terms of sample efficiency and generalizability enhancements for MARL.

Journal

Journal title: Machine Intelligence Research

Year: 2023

ISSN: 2731-538X, 2731-5398

DOI: https://doi.org/10.1007/s11633-022-1383-7